Databases

Definitions

Database: An organised collection of data
Relational database: Database made up of a collection of related tables (relations)
Entity: A category of object
Attributes: Properties of an entity
Entity descriptions: Written as entity_name(attribute1, attribute2...). The entity identifier or primary key is underlined.
Entity identifier / primary key: Each entity must have attribute to uniquely identify each record - the primary key (underlined in the entity description) Sometimes a combination of non-unique attributes can be combined to make a composite unique identifier.
Foreign keys are used to form relationships between two entities. It is an attribute common to both entities. Foreign keys are italicised.

Normalisation

Normalisation is a process to ensure that data in a relational database is stored in most efficient way. In a normalised database, there is no redundant data, each item of data is stored in correct table and is at an atomic level. Atomic means split to a level where it cannot be further decomposed.

First normal form

Tables do not contain repeating attributes or groups
Data is atomic 
Will sometimes result in a composite primary key.

Second normal form

1NF must be achieved first; a database in 1NF may already be in 2NF 
Every non-primary key attribute must depend on the whole of the primary key 
Relations may be split into 2 or more relations.

Third normal form

2NF must be achieved first ; a database in 2NF may already be in 3NF 
Every non-primary key attribute must be solely dependent on the primary key to separate relations - it can't be dependent on any other attribute
Relations may be split into 2 or more relations.

String data types

CHAR

Fixed length
Any remaining space in the field is padded with blanks
1 byte per character
CHAR(100) means it can hold maximum 100 characters, e.g. 'abc' would take up 100 bytes.

VARCHAR

Variable length
Holds only the characters assigned to it
1 byte per character plus 1-2 bytes to hold length information
VARCHAR(100) means it can hold maximum 100 characters, e.g. ‘abc’ would take up 5 bytes.

SQL

SQL commands use the following syntax:

SELECT <attribute> FROM <table> WHERE <condition> ORDER BY <ASC/DESC>
UPDATE <table> SET <attribute> = <value> WHERE <attribute> = <value>
DELETE FROM <table> WHERE <condition>
INSERT INTO <table> (<column1>, <column2>, ...) VALUES (<value1>, <value2>, ...)
CREATE TABLE <table>(<column1> <column1type>, <column2> <column2type>, ...)

Transactions

All transactions must satisfy:

Atomicity - either the whole transaction is executed or none of it.
Consistency - integrity of the DB must be maintained so that the database is consistent before and after the transaction. Refers to correctness of a database.
Isolation - transaction can take place concurrently without leading to inconsistencies in the database. Changes occurring in a particular transaction will not be visible to any other transaction until that particular change in that transaction is written to memory or has been committed. This property ensures that the execution of transactions concurrently will result in a state that is equivalent to a state achieved these were executed serially in some order.
Durability - Changes that are committed must be persistent (permanent) and not lost even if there is a system failure.

Record locks

A lock is put on an item data as soon as a user with write-access starts using it. There are two types of lock:

A shared lock for reading, which any number of transactions can hold at once.
An exclusive lock for writing, which only one transaction can hold at once.

While a record is exclusive locked, other users cannot access the record. This means two users cannot concurrently update a record; users must wait until other transactions have released the record with the COMMIT command. A downside of using locks is that deadlocks can occur, where two transactions wait for each other to release records.

Serialisation

DBMS will ensure that only one transaction at a time is executed on a record. It will work out which transactions need serialising and provide schedule for this.
A user cannot see uncommitted changes that were made by another user or changes committed after it began.
A transaction cannot modify data that has been modified by concurrent transaction. If this is attempted, the transaction is rolled back and restarted.

Timestamp ordering

Each transaction has a timestamp when it starts.
Each piece of data has read and write timestamp.
Until transactions are committed, they are registered as temporary and can be rolled-back if they will cause concurrency violation.
Temporary changes are used to check for concurrency violations.
Two transactions can be started but any transaction can only be allowed to commit if the read and write timestamps for data are <= transaction timestamp.

Commitment ordering

Transactions are all-or-nothing (atomic).
If two transactions are in conflict DBMS will decide which transaction should take precedence, the other is aborted and re-started.
Commitment ordering is used where transactions are created on a client device then transferred to a server, such as when using mobile devices where the connection to the server may be intermittent. The client device may only be connected to the server at the start and end of the transaction.